We sequenced genomes from a ~7,000 year old early farmer from Stuttgart in Germany, an
~8,000 year old hunter-gatherer from Luxembourg, and seven ~8,000 year old
hunter-gatherers from southern Sweden. We analyzed these data together with other ancient
genomes and 2,345 contemporary humans to show that the great majority of present-day
Europeans derive from at least three highly differentiated populations: West European
Hunter-Gatherers (WHG), who contributed ancestry to all Europeans but not to Near
Easterners; Ancient North Eurasians (ANE), who were most closely related to Upper
Paleolithic Siberians and contributed to both Europeans and Near Easterners; and Early
European Farmers (EEF), who were mainly of Near Eastern origin but also harbored
WHG-related ancestry. We model these populations' deep relationships and show that
EEF had ~44% ancestry from a "Basal Eurasian" lineage that split prior to the
diversification of all other non-African lineages.

Ancient DNA studies have demonstrated that migration played a major role in the introduction of
agriculture to Europe, as early farmers were genetically distinct from hunter-gatherers 1,2 and
closer to present-day Near Easterners 2,3. Modelling the ancestry of present-day Europeans as a
simple mixture of two ancestral populations, however, does not take into account their genetic
affinity to an Ancient North Eurasian (ANE) population 4,5 who also contributed genetically to
Native Americans 6. To better understand the deep ancestry of present-day Europeans, we
sequenced nine ancient genomes that span the transition from hunting and gathering to
agriculture in Europe (Fig. 1A; Extended Data Fig. 1): "Stuttgart" (19-fold coverage), a ~7,000
year old skeleton found in Germany in the context of artifacts from the first widespread Neolithic
farming culture of central Europe, the Linearbandkeramik; "Loschbour" (22-fold coverage), an
~8,000 year old skeleton from the Loschbour rock shelter in Heffingen, Luxembourg, discovered
in the context of Mesolithic hunter-gatherer artifacts (SI1; SI2); and seven samples (0.01-2.4-fold
coverage) from an ~8,000 year old Mesolithic hunter-gatherer burial in Motala, Sweden.

A central challenge is to show that DNA sequences retrieved from ancient samples are authentic 
and not due to present-day human contamination. The rate of C→T and G→A mismatches to the
human genome at the ends of the molecules in libraries from each of the ancient samples exceeds
20%, a signature that suggests the DNA is largely ancient 7,8 (SI3). We inferred mitochondrial
DNA (mtDNA) consensus sequences, and based on the number of sites that differed, estimated
contamination rates of 0.3% for Loschbour, 0.4% for Stuttgart, and 0.01%-5% for the Motala
individuals (SI3). We inferred similar levels of contamination for the nuclear DNA of Loschbour
(0.4%) and Stuttgart (0.3%) using a maximum-likelihood-based test (SI3). The effective
contamination rate for the high coverage samples is likely to be far lower, as consensus diploid
genotype calling (SI2) tends to reduce the effects of a small fraction of contaminating reads.

Stuttgart belongs to mtDNA haplogroup T2, typical of Neolithic Europeans 9, while Loschbour
and all Motala individuals belong to haplogroups U5 and U2, typical of pre-agricultural
Europeans 1,7 (SI4). Based on the ratio of reads aligning to chromosomes X and Y, Stuttgart is
female, while Loschbour and five of seven Motala individuals are male 10 (SI5). Loschbour and
the four Motala males whose haplogroups we could determine all belong to Y-chromosome
haplogroup I, suggesting that this was a predominant haplogroup in pre-agricultural northern
Europeans analogous to mtDNA haplogroup U 11 (SI5).

We carried out most of our sequencing on libraries prepared in the presence of uracil DNA 
glycosylase (UDG), which reduces C→T and G→A errors due to ancient DNA damage (SI3).
We first confirm that the ancient samples had statistically indistinguishable levels of Neandertal
ancestry to each other (~2%) and to present-day Eurasians (SI6), and so we do not consider this
further in our analyses of population relationships. We report analyses that leverage the type of
information that can only be obtained from deep coverage genomes, mostly focusing on
Loschbour and Stuttgart, and for some analyses also including Motala12 (2.4x) and La Braña
from Mesolithic Iberia (3.4x) 12. Heterozygosity, the number of differences per nucleotide
between an individual's two chromosomes, is 0.00074 for Stuttgart, at the high end of
present-day Europeans, and 0.00048 for Loschbour, lower than in any present-day humans (SI2).
Through comparison of Loschbour's two chromosomes we find that this low diversity is not due
to recent inbreeding but instead due to a population bottleneck in this individual's more distant
ancestors (Extended Data Fig. 2). Regarding alleles that affect phenotype, we find that the AMY1
gene coding for salivary amylase had 5, 6, 13, and 16 copies in La Braña 12, Motala12, Loschbour
and Stuttgart respectively; these numbers are within the range of present-day Europeans (SI7),
suggesting that high copy counts of AMY1 are not entirely due to selection since the switch to
agriculture 13. The genotypes at SNPs associated with lactase persistence indicate that Stuttgart,
Loschbour, and Motala12 were unable to digest milk as adults. Both Loschbour and Stuttgart
likely had dark hair (>99% probability); Loschbour, like La Braña and Motala12, likely had blue
or intermediate-colored eyes (>75% probability), while Stuttgart most likely had brown eyes
(>99% probability) (SI8). Neither Loschbour nor La Braña carries the skin-lightening allele in
SLC24A5 that is homozygous in Stuttgart and nearly fixed in Europeans today, indicating that
they probably had darker skin 12. However, Motala12 carries at least one copy of the derived
allele, indicating that this locus was already polymorphic in Europeans prior to the advent of
agriculture.

To place the ancient European genomes in the context of present-day human genetic variation, 
we assembled a dataset of 2,345 present-day humans from 203 populations genotyped at 594,924 
autosomal single nucleotide polymorphisms (SNPs) 5 (SI9) (Extended Data Table 1). We used 
ADMIXTURE 14 to identify 59 "West Eurasian" populations (777 individuals) that cluster with 
Europe and the Near East (SI9 and Extended Data Fig. 3). Principal component analysis (PCA) 15 
(SI10) (Fig. 1B) reveals a discontinuity between the Near East and Europe, with each showing
north-south clines bridged only by a few populations of mainly Mediterranean origin. Our PCA
differs from previous studies that showed a correlation with the map of Europe 16,17, which we
determined is due to our study having relatively fewer central and northwestern Europeans, and
more Near Easterners and eastern Europeans (SI10). We projected 18 the newly sequenced and
previously published 2,6,12,19 ancient genomes onto the first two PCs inferred from present-day
samples (Fig. 1B). MA1 and AG2, both Upper Paleolithic hunter-gatherers from Lake Baikal 6 in
Siberia, project at the northern end of the PCA, suggesting an "Ancient North Eurasian"
meta-population (ANE). European hunter-gatherers from Spain, Luxembourg, and Sweden fall outside
the genetic variation of West Eurasians in the direction of European differentiation from the Near
East, with a "West European Hunter-Gatherer" (WHG) cluster including Loschbour and La
Braña 12, and a "Scandinavian Hunter-Gatherer" (SHG) cluster including the Motala individuals
and ~5,000 year old hunter-gatherers from the Swedish Pitted Ware Culture. An "Early
European Farmer" (EEF) cluster includes Stuttgart, the ~5,300 year old Tyrolean Iceman 19 and a
~5,000 year old southern Swedish farmer 2, and is near present-day Sardinians 2,19.

PCA gradients of genetic variation may arise under very different histories. To test if they
reflect population mixture events or are entirely due to genetic drift within West Eurasia, we
computed an f4-statistic 18 that tests whether the ancient MA1 from Siberia shares more alleles
with a Test West Eurasian population or with Stuttgart. We find that f4(Test, Stuttgart; MA1,
Chimp) is positive for many West Eurasians, which must be due to variable degrees of admixture
with ancient populations related to MA1 (Extended Data Fig. 4). We also find that f4(Test,
Stuttgart; Loschbour, Chimp) is nearly always positive in Europeans and always negative in
Near Easterners, indicating that Europeans have more ancestry from ancient populations related
to Loschbour than do Near Easterners (Extended Data Fig. 4). To investigate systematically the
history of population mixture in West Eurasia, we computed all possible statistics of the form
f3(X; Ref1, Ref2) (SI11). An f3-statistic is expected to be positive if no admixture has taken place,
but if X is admixed between populations related to Ref1 and Ref2, it can be negative 5. We tested
all possible pairs of Ref1, Ref2 chosen from the list of 192 present-day populations with at least
four individuals, and five ancient genomes (SI11). The lowest f3-statistics for Europeans are
usually negative (93% are >4 standard errors below zero using a standard error from a block
jackknife 5,21). The most negative statistic (Table 1) always involves at least one ancient
individual as a reference, and for Europeans it is nearly always significantly lower than the most
negative statistic obtained using only present-day populations as references (SI11). MA1 is a
better surrogate (Extended Data Fig. 5) for Ancient North Eurasian ancestry than the Native
American Karitiana who were first used to represent this component of ancestry in Europe 4,5.
Motala12 never appears as one of the references, suggesting that SHG may not be a source for
Europeans. Instead, present-day European populations usually have their lowest f3 with either the
(EEF, ANE) or (WHG, Near East) pair (SI11, Extended Data Table 1). For Near Easterners, the
lowest f3-statistic always takes as references Stuttgart and a population from Africa, the
Americas, South Asia, or MA1 (Table 1), reflecting the fact that both Stuttgart and present-day
Near Easterners harbor ancestry from ancient Near Easterners. Extended Data Fig. 6 plots
statistics of the form f4(West Eurasian X, Chimp; Ancient1, Ancient2) onto a map, showing strong
gradients in the relatedness to Stuttgart (EEF), Loschbour (WHG) and MA1 (ANE).
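
The f3 and f4 tests described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: it assumes population allele frequencies are already available as NumPy arrays, omits the finite-sample bias correction that real f3 estimators apply, and uses a simple equal-sized delete-one-block jackknife over SNP order rather than a weighted jackknife over genomic blocks. The function names and the simulated data are hypothetical.

```python
import numpy as np

def f3(x, ref1, ref2):
    """f3(X; Ref1, Ref2): mean over SNPs of (x - ref1) * (x - ref2)."""
    return float(np.mean((x - ref1) * (x - ref2)))

def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    return float(np.mean((a - b) * (c - d)))

def block_jackknife(stat, freqs, n_blocks=10):
    """Delete-one-block jackknife. Returns (estimate, standard error, Z-score)."""
    n_snps = len(freqs[0])
    blocks = np.array_split(np.arange(n_snps), n_blocks)
    full = stat(*freqs)
    leave_one_out = []
    for blk in blocks:
        keep = np.ones(n_snps, dtype=bool)
        keep[blk] = False  # drop one block of SNPs and recompute
        leave_one_out.append(stat(*[f[keep] for f in freqs]))
    loo = np.array(leave_one_out)
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return full, se, full / se

# Simulated example: X is an exact 50/50 mixture of Ref1 and Ref2,
# so f3(X; Ref1, Ref2) = -0.25 * mean((ref1 - ref2)^2) must be negative.
rng = np.random.default_rng(0)
ref1 = rng.uniform(0.05, 0.95, 5000)
ref2 = rng.uniform(0.05, 0.95, 5000)
x = 0.5 * ref1 + 0.5 * ref2
est, se, z = block_jackknife(f3, (x, ref1, ref2))
print(f"f3 = {est:.4f}, SE = {se:.4f}, Z = {z:.1f}")
```

In this toy case the statistic is negative by construction, which mirrors the logic in the text: a significantly negative f3 (for example, Z below -4) is evidence that X is admixed between populations related to the two references.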

We determined formally that a minimum of three source ancestral populations are needed to
explain the data for many European populations taken together by studying the correlation
patterns of all possible statistics of the form f4(Test_base, Test_i; O_base, O_j) (SI12). Here Test_base is a
reference European population and Test_i the set of all other European Test populations; O_base is a
reference outgroup population, and O_j the set of other outgroups (ancient DNA samples, Onge,
Karitiana, and Mbuti). The rank of the (i, j) matrix reflects the minimum number of source
populations that contributed to the Test populations 22,23. For a pool of 23 Test populations
comprising most present-day Europeans, this analysis rejects descent from just two sources
(P<10^-12 by a Hotelling T²-test 23). However, three source populations are consistent with the data
after excluding the Spanish who have evidence for African admixture 24-26 (P=0.019, not
significant after multiple-hypothesis correction). Our finding of at least three source populations
is also qualitatively consistent with the results from ADMIXTURE (SI9), PCA (Fig. 1B, SI10)
and f-statistics (Extended Data Table 1, Extended Data Fig. 6, SI11, SI12). We caution that the
finding of three sources could be consistent with a larger number of mixture events, as the
method cannot distinguish between one or more mixture events if they are from the same set of
sources. Our analysis also does not assume that the inferred source populations were themselves
unadmixed; indeed, the positive f4(Stuttgart, X; Loschbour, Chimp) statistics obtained when X is
a Near Eastern population (Extended Data Table 1) imply that EEF had some WHG-related
ancestry, which we show in SI13 was at least 0% and less than 45%.

Motivated by the evidence of at least three source populations for present-day Europeans, we set 
out to develop a model consistent with our data. To constrain our search space for modeling, we 
first studied f4-statistics comparing the ancient individuals from Europe and Siberia and diverse
eastern non-African groups (Oceanians, East Asians, Siberians, Native Americans, and Onge
from the Andaman Islands 27) (SI14). We find that: (1) Loschbour (WHG) and Stuttgart (EEF)
share more alleles with each other than either does with MA1 (ANE), as might be expected by
geography, but MA1 shares more alleles with Loschbour than with Stuttgart, indicating a link
between Eurasian hunter-gatherers to the exclusion of European farmers; (2) Eastern
non-Africans share more alleles with Eurasian hunter-gatherers (MA1, Loschbour, La Braña, and
Motala12) than with Stuttgart; (3) Every eastern non-African population except for Native
Americans and Siberians is equally closely related to diverse Eurasian hunter-gatherers, but
Native Americans and Siberians share more alleles with MA1 than with European
hunter-gatherers; and (4) Eurasian hunter-gatherers and Stuttgart both share more alleles with Native
Americans than with other eastern non-Africans. We use the ADMIXTUREGRAPH software 18
to search for a model of population relationships (a tree structure augmented by admixture
events) that is consistent with these observations. We explored models with 0, 1, or 2 admixture
events in the ancestry of the three ancient source populations and eastern non-Africans, and
identified a single model with two admixture events that fit the data. The successful model (Fig.
2A) includes the previously reported gene flow into Native Americans from an MA1-like
population 6, as well as the novel inference that Stuttgart is partially (44 ± 10%) derived from a
"Basal Eurasian" lineage that split prior to the separation of eastern non-Africans from the
common ancestor of WHG and ANE. If this model is accurate, the ANE/WHG split must have
occurred >24,000 years ago since this is the age 6 of MA1 and this individual is on the ANE
lineage. The WHG must then have split from eastern non-Africans >40,000 years ago, as this is
the age of the Chinese Tianyuan sample which clusters with eastern non-Africans to the
exclusion of Europeans 28. The Basal Eurasian split would then have to be even older. A Basal
Eurasian lineage in the Near East is plausible given the presence of anatomically modern humans
in the Levant ~100 thousand years ago and African-related tools likely made by modern
humans in Arabia 30,31. Alternatively, evidence for gene flow between the Near East and Africa 32,
and African morphology in pre-farming Natufians from Israel 33, may also be consistent with the
population representing a later movement of humans out of Africa and into the Near East.

We tested the robustness of the ADMIXTUREGRAPH model in various ways. First, we verified 
that Stuttgart and the Iceman (EEF), and Loschbour and La Braña (WHG) can be formally fit as
clades (SI14). We also used the unsupervised MixMapper 4 (SI15) and TreeMix 34 software (SI16)
to fit graph models; both found all the same admixture events. The statistics supporting our key
inferences about history also provide consistent results when restricted to transversion
polymorphisms not affected by ancient DNA damage, and when repeated with whole-genome
sequencing data that is not affected by SNP ascertainment bias 35 (Extended Data Table 2).

We next fit present-day European populations into our working model. We found that few 
European populations could be fit as 2-way mixtures, but nearly all were compatible with being 
3-way mixtures of ANE/EEF/WHG (SI14). Mixture proportions (Fig. 2B; Extended Data Table 
3) inferred via our model are consistent with those from an independent method that relates 
European populations to diverse outgroups using f4-statistics while making much weaker
modeling assumptions (only assuming that MA1 is an unmixed descendent of ANE, Loschbour
of WHG, and Stuttgart of EEF; SI17). These analyses allow us to infer that EEF ancestry in
Europe today ranges from ~30% in the Baltic region to ~90% in the Mediterranean, a gradient
that is also consistent with patterns of identity-by-descent (IBD) sharing 36 (SI18) and
chromosome painting 37 (SI19) in which Loschbour shares more segments with northern
Europeans and Stuttgart with southern Europeans. Our estimates suggest that Southern
Europeans inherited their European hunter-gatherer ancestry mostly via EEF ancestors (Extended
Data Fig. 6), while Northern Europeans acquired up to 50% additional WHG ancestry.
Europeans have a larger proportion of WHG than ANE ancestry (WHG/(WHG+ANE) = 0.6-0.8)
with the ANE ancestry never being larger than ~20%. (By contrast, in the Near East there is no
detectible WHG ancestry, but substantial ANE ancestry, up to ~29% in the North Caucasus)
(SI14). While ANE ancestry was not as pervasive in Europe during the agricultural transition as
it is today (we do not detect it in either Loschbour or Stuttgart), it was already present, since
MA1 shares more alleles with Motala12 (SHG) than with Loschbour, and Motala12 fits as a
mixture of 81% WHG and 19% ANE (SI14).

Two sets of European populations are poor fits. Sicilians, Maltese, and Ashkenazi Jews have 
EEF estimates beyond the 0-100% interval (SI17) and cannot be jointly fit with other Europeans 
(SI14). These populations may have more Near Eastern ancestry than can be explained via EEF 
admixture (SI14), consistent with their falling in the gap between European and Near Eastern 
populations in Fig. 1B. Finns, Mordovians and Russians from northeastern Europe also do not fit
(SI14; Extended Data Table 3). To better understand this, we plotted f4(X, Bedouin2; Han,
Mbuti) against f4(X, Bedouin2; MA1, Mbuti). These statistics measure the degree of a European
population's allele sharing with Han Chinese or MA1 (Extended Data Fig. 7). Europeans fall on
a line of slope >1 in the plot of these two statistics. However, northeastern Europeans including 
Chuvash and Saami (which we add in to the analysis) fall away from this line in the direction of 
East Asians. This is consistent with East Asian (most likely Siberian) gene flow into northeastern 
Europeans, some of which may be more recent than the original ANE admixture (SI14). 

Three questions seem particularly important to address in follow-up work. Where did the EEF
obtain their WHG ancestry? Southeastern Europe is a candidate as it lies along the path from
Anatolia into central Europe. When and where did the ancestors of present-day Europeans first
acquire their ANE ancestry? Based on discontinuity in mtDNA haplogroup frequencies, this may
have occurred ~5,500-4,000 years ago 40 in Central Europe. When and where did Basal Eurasians
mix into the ancestors of the EEF? An important aim for future work should be to collect DNA
from additional ancient samples to illuminate these transformations.

Methods Summary 

We extracted DNA from nine sets of ancient human remains and converted the extracts into 
Illumina sequencing libraries in dedicated clean rooms. We assessed whether sequences for these 
libraries were consistent with genuine ancient DNA by searching for characteristic deaminations
at the ends of molecules 7,8. We also tested for contamination by searching for evidence of
mixture of DNA from multiple individuals. For large-scale shotgun sequencing we used libraries
that we made in the presence of the enzymes Uracil-DNA-glycosylase and endonuclease VIII,
which reduce the rate of ancient DNA-induced errors. After removal of duplicated molecules, we
called consensus genotypes for the high coverage samples using the Genome Analysis Toolkit 41.
We merged the data with published ancient genomes, as well as with 2,345 present-day humans
from 203 populations genotyped at 594,924 autosomal single nucleotide polymorphisms. We
visualized population structure using Principal Component Analysis 15 and ADMIXTURE 14. To
make inferences about population history, we used methods that can analyze allele frequency
correlation statistics to detect population mixture 5; that can estimate mixture proportions in the
absence of accurate ancestral populations; that can infer the minimum number of source
populations for a collection of test populations; and that can assess formally the fit of genetic
data to models of population history 5.

Table 1: Lowest f3-statistics for each West Eurasian population

Ref1 | Ref2 | Target for which these two references give the lowest f3(X; Ref1, Ref2)
WHG | EEF | Sardinian
WHG | Near East | Basque, Belarusian, Czech, English, Estonian, Finnish, French_South, Icelandic, Lithuanian, Mordovian, Norwegian, Orcadian, Scottish, Spanish, Spanish_North, Ukrainian
EEF | ANE | Abkhasian**, Albanian, Ashkenazi_Jew****, Bergamo, Bulgarian, Chechen, Croatian, Cypriot, Druze, French, Greek, Hungarian, Lezgin, MAI, Maltese, Sicilian, Turkish_Jew, Tuscan
EEF | Native American | Adygei, Balkar, Iranian, Kumyk, North_Ossetian, Turkish
EEF | African | BedouinA, BedouinB†, Jordanian, Lebanese, Libyan_Jew, Moroccan_Jew, Palestinian, Saudi, Syrian, Tunisian_Jew, Yemenite_Jew
EEF | South Asian | Armenian, Georgian, Georgian_Jew, Iranian_Jew, Iraqi_Jew

Note: WHG = Loschbour or La Braña; EEF = Stuttgart; ANE = MA1; Native American = Piapoco;
African = Esan, Gambian, or Kgalagadi; South Asian = GujaratiC or Vishwabrahmin. Statistics are
negative with Z<-4 unless otherwise noted: † (positive) or *, **, ***, **** to indicate Z less than 0,
-1, -2, and -3 respectively. The complete list of statistics can be found in Extended Data Table 1.

Figure Legends

Figure 1: Map of West Eurasian populations and Principal Component Analysis. (a)
Geographical locations of ancient and present-day samples, with color coding matching the PCA.
We show all sampling locations for each population, which results in multiple points for some
populations (e.g., Spain). (b) PCA on all present-day West Eurasians, with the ancient and
selected eastern non-Africans projected. European hunter-gatherers fall beyond present-day
Europeans in the direction of European differentiation from the Near East. Stuttgart clusters with
other Neolithic Europeans and present-day Sardinians. MA1 falls outside the variation of
present-day West Eurasians in the direction of southern-northern differentiation along dimension
2 and between the European and Near Eastern clines along dimension 1.

Figure 2: Modeling of West Eurasian population history. (a) A three-way mixture model that
is a statistical fit to the data for many European populations, ancient DNA samples, and
non-European populations. Present-day samples are colored in blue, ancient samples in red, and
reconstructed ancestral populations in green. Solid lines represent descent without mixture, and
dashed lines represent admixture events. For the two mixture events relating the highly divergent
ancestral populations, we print estimates for the mixture proportions as well as one standard
error. (b) We plot the proportions of ancestry from each of three inferred ancestral populations
(EEF, ANE and WHG) as inferred from the model-based analysis.

Methods

Archeological context, sampling and DNA extraction

The Loschbour sample stems from a male skeleton excavated in 1935 at the Loschbour rock
shelter in Heffingen, Luxembourg. The skeleton was AMS radiocarbon dated to 7,205 ± 50 years
before present (OxA-7738; 6,220-5,990 cal BC) 43. At the Palaeogenetics Laboratory in Mainz,
material for DNA extraction was sampled from a molar (M48) after irradiation with UV-light,
surface removal, and pulverization in a mixer mill. DNA extraction took place in the
palaeogenetics facilities in the Institute for Archaeological Sciences at the University of
Tübingen. Three extracts were made in total, one from 80 mg of powder using an established
silica based protocol 44 and two additional extracts from 90 mg of powder each with a protocol
optimized for the recovery of short DNA molecules 45.

The Stuttgart sample was taken from a female skeleton excavated in 1982 at the site
Viesenhäuser Hof, Stuttgart-Mühlhausen, Germany. It was attributed to the Linearbandkeramik
(5,500-4,800 BC) through associated pottery artifacts and the chronology was corroborated by
radiocarbon dating of the stratigraphy 46. Both sampling and DNA extraction took place in the
Institute for Archaeological Sciences at the University of Tübingen. The M47 molar was
removed and material from the inner part was sampled with a sterile dentistry drill. An extract
was made using 40 mg of bone powder 45.

The Motala individuals were recovered from the site of Kanaljorden in the town of Motala,
Östergötland, Sweden, excavated between 2009 and 2013. The human remains at this site are
represented by several adult skulls and one infant skeleton. All individuals are part of a ritual
deposition at the bottom of a small lake. Direct radiocarbon dates on the remains range between
7,013 ± 76 and 6,701 ± 64 BP (6,361-5,516 cal BC), corresponding to the late Middle Mesolithic
of Scandinavia. Samples were taken from the teeth of the nine best preserved skulls, as well as a
femur and tibia. Bone powder was removed from the inner parts of the teeth or bones with a
sterile dentistry drill. DNA from 100 mg of bone powder was extracted 47 in the ancient DNA
laboratory of the Archaeological Research Laboratory, Stockholm.

Library preparation

Illumina sequencing libraries were prepared using either double- or single-stranded library
preparation protocols 48,49 (SI1). For high-coverage shotgun sequencing libraries, a DNA repair
step with Uracil-DNA-glycosylase (UDG) and endonuclease VIII (endo VIII) treatment was
included in order to remove uracil residues 50. Size fractionation on a PAGE gel was also
performed in order to remove longer DNA molecules that are more likely to be contaminants 49.
Positive and blank controls were carried along during every step of library preparation.

Shotgun sequencing and read processing

All non-UDG-treated libraries were sequenced either on an Illumina Genome Analyzer IIx with
2x76 + 7 cycles for the Loschbour and Motala libraries, or on an Illumina MiSeq with 2x150 + 8
+ 8 cycles for the Stuttgart library. We followed the manufacturer's protocol for multiplex
sequencing. Raw overlapping forward and reverse reads were merged and filtered for quality 51
and mapped to the human reference genome (hg19/GRCh37/1000Genomes) using the
Burrows-Wheeler Aligner (BWA) (SI2). For deeper sequencing, UDG-treated libraries of Loschbour
were sequenced on 3 Illumina HiSeq 2000 lanes with 50-bp single-end reads, 8 Illumina HiSeq
2000 lanes of 100-bp paired-end reads and 8 Illumina HiSeq 2500 lanes of 101-bp paired-end
reads. The UDG-treated library for Stuttgart was sequenced on 8 HiSeq 2000 lanes of 101-bp
paired-end reads. The UDG-treated libraries for Motala were sequenced on 8 HiSeq 2000 lanes
of 100-bp paired-end reads, with 4 lanes each for two pools (one of 3 individuals and one of 4
individuals). We also sequenced an additional 8 HiSeq 2000 lanes for Motala12, the Motala
sample with the highest percentage of endogenous human DNA.

Enrichment of mitochondrial DNA and sequencing 

Non-UDG-treated libraries of Loschbour and all Motala samples were enriched for human 
mitochondrial DNA using a bead-based capture approach with present-day human DNA as bait 
to test for DNA preservation and mtDNA contamination. UDG-treatment was omitted in order to
allow characterization of damage patterns typical for ancient DNA 8. The captured libraries were
sequenced on an Illumina Genome Analyzer IIx platform with 2 x 76 + 7 cycles and the resulting
reads were merged and quality filtered 51. The sequences were mapped to the Reconstructed
Sapiens Reference Sequence, RSRS 54, using a custom iterative mapping assembler, MIA 55 (SI4).

Contamination estimates

We assessed if the sequences had the characteristics of authentic ancient DNA using four
approaches. First, we searched for evidence of contamination by determining whether the
sequences mapping to the mitochondrial genome were consistent with deriving from more than
one individual 55,56. Second, for the high-coverage Loschbour and Stuttgart genomes, we used a
maximum-likelihood-based estimate of autosomal contamination that uses variation at sites that
are fixed in the 1000 Genomes data 57 to estimate error, heterozygosity and contamination
simultaneously. Third, we estimated contamination based on the rate of polymorphic sites on the
X chromosome of the male Loschbour individual 58 (SI3). Fourth, we analyzed non-UDG-treated
reads mapping to the RSRS to search for aDNA-typical damage patterns resulting in C→T
changes at the 5'-end of the molecule (SI3).

Phylogenetic analysis of the mitochondrial genomes 

All nine complete mitochondrial genomes that fulfilled the criteria of authenticity were assigned 
to haplogroups using Haplofind 59 . A Maximum Parsimony tree including present day humans 
and previously published ancient mtDNA sequences was generated with MEGA 60 . The effect of 
branch shortening due to a lower number of substitutions in ancient lineages was studied by 
calculating the nucleotide edit distance to the root for all haplogroup R sequences (SI4). 

Sex Determination and Y-chromosome Analysis 

We assessed the sex of all sequenced individuals by using the ratio of (chrY) to (chrY+chrX) 
aligned reads 10. We downloaded a list of Y-chromosome SNPs curated by the International
Society of Genetic Genealogy (ISOGG, http://www.isogg.org) v. 9.22 (accessed Feb. 18, 2014)
and determined the state of the ancient individuals at positions where a single allele was
observed and MAPQ>30. We excluded C/G or A/T SNPs due to uncertainty about the polarity of
the mutation in the database. The ancient individuals were assigned haplogroups based on their
derived state (SI5). We also used BEAST v1.7.51 61 to assess the phylogenetic position of
Loschbour using 623 males from around the world with 2,799 variant sites across 500kb of
non-recombining Y-chromosome sequence 62 (SI5).
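
The read-ratio test for sex described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the two cut-off values are placeholder defaults assumed for the example, not values stated in this text; the thresholds actually used should be taken from the cited method (reference 10).

```python
def y_ratio(n_chr_x: int, n_chr_y: int) -> float:
    """Fraction of sex-chromosome reads that align to chrY: chrY / (chrY + chrX)."""
    total = n_chr_x + n_chr_y
    if total == 0:
        raise ValueError("no reads aligned to chrX or chrY")
    return n_chr_y / total

def assign_sex(n_chr_x: int, n_chr_y: int,
               female_max: float = 0.016, male_min: float = 0.075) -> str:
    """Classify by the chrY ratio. Thresholds are assumed, illustrative defaults."""
    ratio = y_ratio(n_chr_x, n_chr_y)
    if ratio <= female_max:
        return "female"
    if ratio >= male_min:
        return "male"
    return "undetermined"  # too ambiguous, e.g. low coverage or contamination

# Hypothetical read counts, not data from the study
print(assign_sex(100_000, 500))    # very few chrY reads
print(assign_sex(90_000, 10_000))  # substantial chrY fraction
```

Keeping an explicit "undetermined" band between the two cut-offs avoids forcing a call for low-coverage samples, where the ratio is noisy.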

Estimation of Neandertal admixture

We estimate Neandertal admixture in ancient individuals with the f4-ratio or S-statistic 5,63,64

α = f4(Altai, Denisova; Test, Yoruba) / f4(Altai, Denisova; Vindija, Yoruba)

which uses whole genome data from Altai, a high coverage (52x) Neanderthal genome sequence,
Denisova, a high coverage sequence 49 from another archaic human population (31x), and
Vindija, a low coverage (1.3x) Neanderthal genome from a mixture of three Neanderthal
individuals from Vindija Cave in Croatia 63.

Inference of demographic history and inbreeding 

We used the Pairwise Sequentially Markovian Coalescent (PSMC) 65 to infer the size of the 
ancestral population of Stuttgart and Loschbour. This analysis requires high quality diploid 
genotype calls and cannot be performed in the low-coverage Motala samples. To determine 
whether the low effective population size inferred for Loschbour is due to recent inbreeding, we 
plotted the time-to-most-recent common ancestor (TMRCA) along each of chr1-22 to detect runs 
of low TMRCA. 

Analysis of segmental duplications and copy number variants 

We built read-depth based copy number maps for the Loschbour, Stuttgart and Motala12 
genomes in addition to the Denisova and Altai Neanderthal genomes and 25 deeply sequenced 
modern genomes (SI7). We built these maps by aligning reads, subdivided into their 
non-overlapping 36-bp constituents, against the reference genome using the mrsFAST aligner 66, and 
renormalizing read-depth for local GC content. We estimated copy numbers in windows of 500 
unmasked base pairs slid at 100 bp intervals across the genome. We called copy number variants 
using a scale space filter algorithm. We genotyped variants of interest and compared the 
genotypes to those from individuals sequenced as part of the 1000 Genomes Project 67. 
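
The windowing step can be illustrated with a minimal Python sketch. It is not the pipeline used here: the function name `copy_number_windows`, the `diploid_depth` argument (the depth expected for a two-copy region), and the choice to slide windows in unmasked coordinates are assumptions made for illustration, and the GC renormalization step is omitted.

```python
def copy_number_windows(depth, masked, diploid_depth, window=500, step=100):
    """Estimate copy number in sliding windows of unmasked bases.

    depth         -- per-base read depth along one chromosome
    masked        -- per-base booleans, True where the base is masked
    diploid_depth -- expected depth of a two-copy region (assumed known)
    """
    # Drop masked bases so that each window holds `window` unmasked positions.
    unmasked = [d for d, m in zip(depth, masked) if not m]
    estimates = []
    for start in range(0, len(unmasked) - window + 1, step):
        mean_depth = sum(unmasked[start:start + window]) / window
        # Scale so that a region at the diploid depth gets copy number 2.
        estimates.append(2.0 * mean_depth / diploid_depth)
    return estimates


# Toy chromosome: 600 bp at two-copy depth, a 50 bp masked gap,
# then 600 bp at 1.5x the two-copy depth (hypothetical values).
depth = [30] * 600 + [0] * 50 + [45] * 600
masked = [False] * 600 + [True] * 50 + [False] * 600
cn = copy_number_windows(depth, masked, diploid_depth=30)
print(len(cn), cn[0], cn[-1])
```

In this toy input the first windows sit entirely in the two-copy region and the last windows entirely in the elevated region, so the estimates move from 2 to 3 across the boundary.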

Phenotypic inference 

We inferred likely phenotypes (SI8) by analyzing DNA polymorphism data in the VCF format 68 
using VCFtools (http://vcftools.sourceforge.net/). For the Loschbour and Stuttgart individuals, 
we included data from sites not flagged as LowQuality, with genotype quality (GQ) of >30, and 
SNP quality (QUAL) of >50. For Motala12, which is of lower coverage, we included sites 
that had at least 2x coverage and passed visual inspection of the local alignment using samtools 
tview (http://samtools.sourceforge.net) 69. 

Human Origins dataset curation 

The Human Origins array consists of 14 panels of SNPs for which the ascertainment is well 
known 5,70. All population genetics analyses were carried out on a set of 594,924 autosomal SNPs, 
after restricting to sites that had >90% completeness across 7 different batches of sequencing, 
and that had >97.5% concordance with at least one of two subsets of samples for which whole 
genome sequencing data was also available. The total dataset consists of 2,722 individuals, 
which we filtered to 2,345 individuals (203 populations) after removing outlier individuals or 
relatives based on visual inspection of PCA plots 15,71 or model-based clustering analysis 14. 
Whole genome amplified (WGA) individuals were not used in analysis, except for a Saami 
individual whom we included because of the special interest of this population for Northeastern 
European population history (Extended Data Fig. 7). 

ADMIXTURE analysis 

We merged all Human Origins genotype data with whole genome sequencing data from 
Loschbour, Stuttgart, MA1, Motala12, Motala_merge, and LaBrana. We then thinned the 
resulting dataset to remove SNPs in linkage-disequilibrium with PLINK 1.07 72, using a window 
size of 200 SNPs advanced by 25 SNPs and an $r^2$ threshold of 0.4. We ran ADMIXTURE 
1.23 14,73 for 100 replicates with different starting random seeds, default 5-fold cross-validation, 
and varying the number of ancestral populations K between 2 and 20. We assessed clustering 
quality using CLUMPP 74 . We used the ADMIXTURE results to identify a set of 59 "West 
Eurasian" (European/Near Eastern) populations based on values of a "West Eurasian" ancestral 
population at K=3 (SI9). We also identified 15 populations for use as "non-West Eurasian 
outgroups" based on their having at least 10 individuals and no evidence of European or Near 
Eastern admixture at K=11, the lowest K for which Near Eastern/European-maximized ancestral 
populations appeared consistently across all 100 replicates. 

Principal Components Analysis 

We used smartpca 15 (version: 10210) from EIGENSOFT 71,75 5.0.1 to carry out Principal 
Components Analysis (PCA) (SI10). We performed PCA on a subset of individuals and then 
projected others using the lsqproject: YES option that gives an unbiased inference of the position 
of samples even in the presence of missing data (especially important for ancient DNA). 

$f_3$-statistics 

We use the $f_3$-statistic 5 $f_3(\mathrm{Test}; \mathrm{Ref}_1, \mathrm{Ref}_2) = \frac{1}{N}\sum_{i=1}^{N} (t_i - r_{1,i})(t_i - r_{2,i})$, where $t_i$, $r_{1,i}$ and $r_{2,i}$ 
are the allele frequencies for the $i$th SNP in populations Test, Ref1, Ref2, respectively, to 
determine if there is evidence that the Test population is derived from admixture of populations 
related to Ref1 and Ref2 (SI11). A significantly negative statistic provides unambiguous evidence 
of mixture in the Test population 5. We allow Ref1 and Ref2 to be any Human Origins population 
with 4 or more individuals, or Loschbour, Stuttgart, MA1, Motala12, LaBrana. We assess 
significance of the $f_3$-statistics using a block jackknife 21 and a block size of 5cM. We report 
significance as the number of standard errors by which the statistic differs from zero (Z-score). 
We also perform an analysis in which we constrain the reference populations to be (i) EEF 
(Stuttgart) and WHG (Loschbour or LaBrana), (ii) EEF and a Near Eastern population, (iii) EEF 
and ANE (MA1), or (iv) any two present-day populations, and compute a Zdiff score between the 
lowest $f_3$-statistic observed in the dataset, and the $f_3$-statistic observed for the specified pair. 
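
To make the definition and its significance test concrete, the following is a simplified Python sketch of an $f_3$-statistic with a delete-one-block jackknife Z-score. It is an illustration under stated assumptions, not the software used here: the function names and toy allele frequencies are hypothetical, blocks are weighted equally (as if all blocks held the same number of SNPs), block labels are taken as given (for example, 5cM bins), and no finite-sample-size correction is applied.

```python
from math import sqrt


def f3(test, ref1, ref2):
    """Mean over SNPs of (t_i - r1_i) * (t_i - r2_i)."""
    n = len(test)
    return sum((t - a) * (t - b) for t, a, b in zip(test, ref1, ref2)) / n


def f3_z_score(test, ref1, ref2, block_ids):
    """Z-score of f3 from an equally weighted delete-one-block jackknife."""
    blocks = sorted(set(block_ids))
    g = len(blocks)
    leave_one_out = []
    for b in blocks:
        # Recompute the statistic with every SNP of block b removed.
        keep = [i for i, bid in enumerate(block_ids) if bid != b]
        leave_one_out.append(
            f3([test[i] for i in keep],
               [ref1[i] for i in keep],
               [ref2[i] for i in keep])
        )
    mean = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((e - mean) ** 2 for e in leave_one_out)
    return f3(test, ref1, ref2) / sqrt(variance)


# Toy frequencies: Test lies midway between Ref1 and Ref2 at every SNP,
# which is the pattern expected for an admixed population.
ref1 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
ref2 = [0.5, 0.4, 0.9, 0.8, 0.7, 1.0]
test = [0.3, 0.3, 0.6, 0.6, 0.6, 0.8]
blocks = [0, 0, 1, 1, 2, 2]
print(f3(test, ref1, ref2), f3_z_score(test, ref1, ref2, blocks))
```

Because Test is intermediate between the two references at each SNP, every term in the sum is negative, so both the statistic and its Z-score come out negative in this example.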

$f_4$-statistics 

We analyze $f_4$-statistics 5 of the form $f_4(A, B; C, D) = \frac{1}{N}\sum_{i=1}^{N} (a_i - b_i)(c_i - d_i)$ to assess if 
populations A, B are consistent with forming a clade in an unrooted tree with respect to C, D. If 
they form a clade, the allele frequency differences between the two pairs should be uncorrelated 
and the statistic has an expected value of 0. We set the outgroup D to be a sub-Saharan African 
population or Chimpanzee. We systematically tried all possible combinations of the ancient 
samples or 15 "non-West Eurasian outgroups" identified by ADMIXTURE analysis as A, B, C 
to determine their genetic affinities (SI14). Setting A as a present-day test population and B as 
either Stuttgart or BedouinB, we documented relatedness to C=(Loschbour or MA1) or 
C=(MA1 and Karitiana) or C=(MA1 or Han) (Extended Data Figs. 4, 5, 7). Setting C as a test 
population and (A, B) a pair from (Loschbour, Stuttgart, MA1) we documented differential 
relatedness to ancient populations (Extended Data Fig. 6). We computed D-statistics 63 using 
transversion polymorphisms in whole genome sequence data to confirm robustness to 
ascertainment and ancient DNA damage (Extended Data Table 2). 
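
Restricting to transversions removes the C/T and G/A sites most affected by ancient DNA damage. The Python sketch below shows one way such a restricted D-statistic could be computed. It assumes the allele-frequency form of D, in which the $f_4$ numerator is normalized by $\sum_i (a_i + b_i - 2a_ib_i)(c_i + d_i - 2c_id_i)$; the tuple layout, function names and toy genotypes are hypothetical.

```python
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transversion(ref, alt):
    """True if the two alleles differ by a transversion."""
    return frozenset((ref, alt)) not in TRANSITIONS


def d_statistic(snps):
    """D(A, B; C, D) over transversion SNPs only.

    snps -- iterable of (ref, alt, a, b, c, d), where a..d are the
            frequencies of the alt allele in populations A..D.
    """
    numerator = 0.0
    denominator = 0.0
    for ref, alt, a, b, c, d in snps:
        if not is_transversion(ref, alt):
            continue  # skip C/T and A/G sites, which can mimic damage
        numerator += (a - b) * (c - d)
        denominator += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    return numerator / denominator


# Toy data with one pseudo-haploid call (0 or 1) per population.
snps = [
    ("A", "C", 1.0, 0.0, 1.0, 0.0),  # transversion, kept
    ("C", "T", 0.0, 1.0, 1.0, 0.0),  # transition, dropped
    ("G", "T", 1.0, 0.0, 1.0, 0.0),  # transversion, kept
]
print(d_statistic(snps))
```

In this toy input the single transition site carries the opposite signal from the two transversions, so dropping it changes the result, which is the kind of sensitivity the transversion-only analysis is meant to check.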

Minimum number of source populations for Europeans 

We used qpWave 22,23 to study the minimum number of source populations for a designated set of 
Europeans (SI12). We use $f_4$-statistics of the form $X(l, r) = f_4(l_0, l; r_0, r)$ where $l_0$, $r_0$ are arbitrarily 
chosen "base" populations, and $l$, $r$ are other populations from two sets L and R respectively. If 
$X(l, r)$ has rank $r$ and there were $n$ waves of immigration into R with no back-migration from R to 
L, then $r+1 \le n$. We set L to include Stuttgart, Loschbour, MA1, Onge, Karitiana, Mbuti and R to 
include 23 modern European populations who fit the model of SI14 and had admixture 
proportions within the interval [0,1] for the method with minimal modeling assumptions (SI17). 
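
The rank argument can be illustrated with a short Python sketch. This is only the linear-algebra core under the assumption of noise-free $f_4$ values: qpWave itself assesses rank statistically, taking the uncertainty of the estimated statistics into account, which is not attempted here. The function names, tolerance and toy matrix are hypothetical.

```python
def matrix_rank(rows, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot: this column adds no new dimension
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank


def min_source_populations(x_matrix):
    """Lower bound on the number of waves n implied by rank + 1 <= n."""
    return matrix_rank(x_matrix) + 1


# Toy X(l, r) matrix: the second row is twice the first, so the rank is 2.
x = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [1.0, 0.0, 1.0],
]
print(matrix_rank(x), min_source_populations(x))
```

A rank of 2 in this toy matrix would imply at least three source populations.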

Admixture proportions for Stuttgart in the absence of a Near Eastern ancient genome 

We used Loschbour and BedouinB as surrogates for "Unknown hunter-gatherer" and Near 
Eastern (NE) farmer populations that contributed to Stuttgart (SI13). Ancient Near Eastern 
ancestry in Stuttgart is estimated by the $f_4$-ratio 5,18 
$f_4(\mathrm{Outgroup}, X; \mathrm{Loschbour}, \mathrm{Stuttgart}) / f_4(\mathrm{Outgroup}, X; \mathrm{Loschbour}, \mathrm{NE})$. A complication is that BedouinB is a mixture of NE and 
African ancestry 23. We therefore subtracted the effects of African ancestry using estimates of the 
BedouinB African admixture proportion from ADMIXTURE (SI9) or ALDER 76. 

Admixture graph modeling 

We used ADMIXTUREGRAPH 5 (version 3110) to model population relationships between 
Loschbour, Stuttgart, Onge, and Karitiana using Mbuti as an African outgroup. We assessed 
model fit using a block jackknife of differences between estimated and fitted $f$-statistics for the 
set of included populations (we expressed the fit as a Z score). We determined that a model 
failed if |Z|>3 for at least one $f$-statistic. A basic tree model failed and we manually amended the 
model to test all possible models with a single admixture event, which also failed. Further 
manual amendment to include 2 admixture events resulted in 8 successful models, only one of 
which could be amended to also fit MA1 as an additional constraint. We successfully fit both 
the Iceman and LaBrana into this model as simple clades and Motala12 as a 2-way mixture. We 
also fit present-day West Eurasians as clades, 2-way mixtures, or 3-way mixtures in this basic 
model, achieving a successful fit for a larger number of European populations (n=26) as 3-way 
mixtures. We estimated the individual admixture proportions from the fitted model parameters. 
To test if fitted parameters for different populations are consistent with each other, we jointly fit 
all pairs of populations A and B by modifying ADMIXTUREGRAPH to add a large constant 
(10,000) to the variance term $f_3(A_0, A, B)$. By doing this, we can safely ignore recent gene flow 
within Europe that affects statistics that include both A and B. 

Ancestry estimates from $f_4$-ratios 

We estimate EEF ancestry using the $f_4$-ratio 5,18 
$f_4(\mathrm{Mbuti}, \mathrm{Onge}; \mathrm{Loschbour}, \mathrm{European}) / f_4(\mathrm{Mbuti}, \mathrm{Onge}; \mathrm{Loschbour}, \mathrm{Stuttgart})$, which produces consistent results with ADMIXTUREGRAPH 
(SI14). We use $f_4(\mathrm{Stuttgart}, \mathrm{Loschbour}; \mathrm{Onge}, \mathrm{MA1}) / f_4(\mathrm{Mbuti}, \mathrm{MA1}; \mathrm{Onge}, \mathrm{Loschbour})$ to 
estimate Basal Eurasian admixture into Stuttgart. We use 
$f_4(\mathrm{Stuttgart}, \mathrm{Loschbour}; \mathrm{Onge}, \mathrm{Karitiana}) / f_4(\mathrm{Stuttgart}, \mathrm{Loschbour}; \mathrm{Onge}, \mathrm{MA1})$ to estimate ANE mixture in Karitiana (Fig. 2B). 
We use $f_4(\mathrm{Test}, \mathrm{Stuttgart}; \mathrm{Karitiana}, \mathrm{Onge}) / f_4(\mathrm{MA1}, \mathrm{Stuttgart}; \mathrm{Karitiana}, \mathrm{Onge})$ to lower bound 
ANE mixture into North Caucasian populations. 
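
As a worked illustration of how an $f_4$-ratio recovers a mixture proportion, the Python sketch below evaluates the EEF ratio on made-up allele frequencies. The frequencies and function names are hypothetical, and the "European" population is constructed as an exact 60/40 mixture of the Stuttgart and Loschbour frequencies, an idealization that ignores drift and sampling noise.

```python
def f4(a, b, c, d):
    """Mean over SNPs of (a_i - b_i) * (c_i - d_i)."""
    n = len(a)
    return sum((ai - bi) * (ci - di)
               for ai, bi, ci, di in zip(a, b, c, d)) / n


def f4_ratio(numerator_pops, denominator_pops):
    """Ratio of two f4-statistics, each given as a 4-tuple of populations."""
    return f4(*numerator_pops) / f4(*denominator_pops)


# Hypothetical allele frequencies at four SNPs.
mbuti = [0.1, 0.5, 0.9, 0.3]
onge = [0.4, 0.2, 0.6, 0.8]
loschbour = [0.2, 0.7, 0.5, 0.1]
stuttgart = [0.6, 0.3, 0.8, 0.4]

# Idealized European: 60% Stuttgart-like (EEF), 40% Loschbour-like (WHG).
european = [0.6 * s + 0.4 * l for s, l in zip(stuttgart, loschbour)]

eef = f4_ratio((mbuti, onge, loschbour, european),
               (mbuti, onge, loschbour, stuttgart))
print(round(eef, 6))
```

Under this construction the difference between the Loschbour and European frequencies is 0.6 times the difference between Loschbour and Stuttgart at every SNP, so the ratio returns the EEF fraction that was built in.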

MixMapper analysis 

We carried out MixMapper 2.0 4 analysis, a semi-supervised admixture graph fitting technique. 
First, we inferred a scaffold tree of populations without strong evidence of mixture relative to each 
other (Mbuti, Onge, Loschbour and MA1). We did not include European populations in the 
scaffold as all had significantly negative $f_3$-statistics indicating admixture. We then ran 
MixMapper to infer the relatedness of the other ancient and present-day samples, fitting them 
onto the scaffold as 2- or 3-way mixtures. The uncertainty in all parameter estimates is measured 
by block bootstrap resampling of the SNP set (100 replicates with 50 blocks). 

TreeMix analysis 

We applied TreeMix 14 to Loschbour, Stuttgart, Motala12, and MA1 6, LaBrana 12 and the 
Iceman 19, along with the present-day samples of Karitiana, Onge and Mbuti. We restricted the 
analysis to 265,521 Human Origins array sites after excluding any SNPs where there were 
no-calls in any of the studied individuals. The tree was rooted with Mbuti and standard errors were 
estimated using blocks of 500 SNPs. We repeated the analysis on whole-genome sequence data, 
rooting with Chimp and replacing Onge with Dai since we did not have Onge whole genome 
sequence data 35. We varied the number of migration events (m) between 0 and 5. 

Inferring admixture proportions with minimal modeling assumptions 

We devised a method to infer ancestry proportions from three ancestral populations (EEF, WHG, 
and ANE) without strong phylogenetic assumptions (SI17). We rely on 15 "non-West Eurasian" 
outgroups and study $f_4(\mathrm{European}, \mathrm{Stuttgart}; O_1, O_2)$ which equals 
$\alpha\beta\, f_4(\mathrm{Loschbour}, \mathrm{Stuttgart}; O_1, O_2) + \alpha(1-\beta)\, f_4(\mathrm{MA1}, \mathrm{Stuttgart}; O_1, O_2)$ 
if European has $1-\alpha$ ancestry from EEF and the remaining $\alpha$ is split into fractions $\beta$ and $1-\beta$ 
from WHG and ANE respectively. This defines a system of $\binom{15}{2} = 105$ equations with 
unknowns $\alpha\beta$, $\alpha(1-\beta)$, which we solve with least squares implemented in the function lsfit in R to 
obtain estimates of $\alpha$ and $\beta$. We repeated this computation 22 times dropping one chromosome at 
a time 26 to obtain block jackknife 21 estimates of the ancestry proportions and standard errors, 
with block size equal to the number of SNPs per chromosome. We assessed consistency of the 
inferred admixture proportions with those derived from the ADMIXTUREGRAPH model based 
on the number of standard errors between the two (Extended Data Table 1). 
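
The structure of this estimation can be sketched in Python as follows. This is an illustration rather than the analysis code, which used lsfit in R: the function names and the synthetic $f_4$ values are hypothetical, the fit here has no intercept term (an assumption about how the regression was set up), and the leave-one-chromosome-out jackknife is not shown.

```python
from itertools import combinations


def solve_two_unknowns(x1, x2, y):
    """Least-squares fit of y ~ u*x1 + v*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    u = (t1 * s22 - t2 * s12) / det
    v = (t2 * s11 - t1 * s12) / det
    return u, v


def ancestry_proportions(f4_whg, f4_ane, f4_european):
    """Convert the fitted u = alpha*beta and v = alpha*(1-beta) into proportions.

    Each argument holds one f4 value per outgroup pair (O1, O2):
    f4(Loschbour, Stuttgart; O1, O2), f4(MA1, Stuttgart; O1, O2) and
    f4(European, Stuttgart; O1, O2), respectively.
    """
    u, v = solve_two_unknowns(f4_whg, f4_ane, f4_european)
    alpha = u + v
    return {"EEF": 1.0 - alpha, "WHG": u, "ANE": v,
            "alpha": alpha, "beta": u / alpha}


# One equation per unordered pair of the 15 outgroups.
pairs = list(combinations(range(15), 2))
print(len(pairs))

# Synthetic f4 values (not real data), with a European population built to
# have alpha*beta = 0.3 and alpha*(1-beta) = 0.2.
f4_whg = [((7 * i + 3 * j) % 11 - 5) / 100 for i, j in pairs]
f4_ane = [((5 * i + 2 * j) % 13 - 6) / 100 for i, j in pairs]
f4_eur = [0.3 * a + 0.2 * b for a, b in zip(f4_whg, f4_ane)]
print(ancestry_proportions(f4_whg, f4_ane, f4_eur))
```

Solving for the products $\alpha\beta$ and $\alpha(1-\beta)$ keeps the problem linear; $\alpha$ and $\beta$ are then recovered from their sum and ratio.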

Haplotype-based analyses 

We used RefinedIBD from BEAGLE 4 77 with the settings ibdtrim=20 and ibdwindow=25 to 
study IBD sharing between Loschbour and Stuttgart and populations from the POPRES dataset 42. 
We kept all IBD tracts spanning at least 0.5 centimorgans (cM) and with a LOD score >3 (SI18). 
We also used ChromoPainter 37 to study haplotype sharing between Loschbour and Stuttgart and 
present-day West Eurasian populations (SI19). We identified 495,357 SNPs that were complete 
in all individuals and phased the data using Beagle 4 77 with parameters phase-its=50 and 
impute-its=10. We did not keep sites with missing data to avoid imputing modern alleles into the ancient 
individuals. We combined ChromoPainter output for chromosomes 1-22 using 
ChromoCombine 37. We carried out a PCA of the co-ancestry matrix using fineSTRUCTURE 37. 